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ABSTRACT 


Traditional approaches to database-system design and 
implementation involve single-model, single-language data- 
base systems with their inherent lack of flexibility and 
extensibility. An alternative to the traditional approach 
to database-system design and implementation is the multi- 
lingual database system (MLDS). This approach allows the 
user with the user's familiar data language to access and 
update one or more unfamiliar databases in different data 
models as if they are in the user's familiar data model. 
Thus, MLDS has the flexibility and extensibility in database 
accesses. 

In this thesis, we present a methodology for the 
relational user to access and update network databases with 
a relational data language. Specifically, we designed an 
interface for allowing the recreational/SQL user to access a 
network database via SQL transactions. This thesis further 


extends the functionality of MLDS. 
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I. INTRODUCTION 


A. AN OVERVIEW 

Databases have been an integral part of our society ever 
since records have been kept. Probably the most visible 
example is the merchants of old. They kept records of their 
customer's transactions and the balance of payments. The 
tax collector must have had a database to know who were in 
the district and what they had paid in the past. As times 
moved on databases were still kept by hand and the only 
access to them was with the user of the databases. Of 
course, copies could be made by hand: however, the point 
then was that the user would only make copies as a back-up 
for himself, since it was a time-consuming process. 

As the computer was being developed, the databases were 
no longer updated by hand. Although the computer was 
initially thought of as a means to perform mathematical 
calculations that were too complex to be done by hand 
efficiently, the computer was eventually thought of as a 
means to store and retrieve data. 

Database systems were therefore designed. The concept 
of massive storage capabilities and on-line manipulation of 
databases have brought on a myriad of database systems. 
Each database-system design was influenced by the type of 


databases utilized. “Therefore, in a large organization the 


result was that it acquired more than one type of database 
due to diverse applications. Each database type had its 
unique language that must be used to access that type of 
databases. This situation created a problem for the 
organization. To access databases of the organization a 
user must know all the respective types and languages 
associated with the databases. 

As the society has progressed in improving database 
usage the computer has provided for direct manipulation via 
complex queries which can be done in seconds that would take 
hours or days to do by hand. Further, the storage space 
required has vastly reduced and most importantly users may 
now have on-line data to work with. The problem is that 
these database systems and their databases are heterogeneous 
in that if a user wants information in a database system 
whose data model and data language are not familiar to the 
user the user cannot access and query the database. 

An approach conceived by Demurjian and Hsiao [Ref. 1] 
solves the user access problem and reduces the maintenance 
costs of database systems. The concept is to create a 
database system that supports more than one database model 
and then allow users to access any database in the system 
and manipulate the database in the user's own familiar 
language. The first stage in the process is to allow a user 
to access multiple databases if the user is familiar with 


the languages required to use these databases. This system 


is the multi-lingual database system (MLDS) (Ref. 2]. Thus, 
it is no longer necessary to have a large number of 
heterogeneous database systems to support heterogeneous 
database applications. The benefit in cost is that hardware 
is only required to support a single system, i.e., MLDS. 
Further, maintaining a single system is more efficient than 
training and providing maintenance for many heterogeneous 
systems. The second stage in the process is to provide 
cross-model accessing. More specifically, MLDS allows for 
direct accessing of multiple databases because all the data 
respond to the same data language. The user's data language 
is translated into the attribute-based data language (ABDL) 
and the database created in the user's data model is stored 
in the attribute-based data model (ABDM). This capability 
allows the user to access a database whose data model is 
foreign to the user with the user's own familiar data 
language. Thus, the cross-model accessing capability will 
extend MLDS into a multi-model database system (MMDS) (Ref. 


Br. 


B. THE MULTI-LINGUAL DATABASE SYSTEM (MLDS) 

The multi-lingual database system has been described in 
many theses prior to this one and we also briefly trace the 
path that a user will take when accessing the system. The 
structure of the multi-lingual database system is depicted 
in Figure 1. In accessing a database the user uses a user 


data language (UDL) that corresponds with the user data 
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Figure 1. The Multi-Lingual Database System 


model (UDM). The language-interface layer (LIL) identifies 
which of the two possible transaction types is requested and 
sends it on to the kernel mapping system (KMS). If the user 
wants to create a new database, KMS modifies UDM to the 
kernel data model (KDM); if the user wants to use an 
existing database, KMS modifies UDL to a kernel data 
language (KDL). The transformed request is then routed 

to the kernel database system (KDS) via the kernel 
controller (KC). When KDS has completed the task desired by 
the user, it then sends the result back to LIL in a format 


that UDM recognizes. The kernel formatting system (KFS) 


receives the result from KDS via the KC, reverses the 
process performed by KMS, and sends the result to LIL where 
the user receives the result. 

LIL is unique to a pair of UDM and UDL. Ina 
multi-lingual database system, a separate LIL is required 
for each pair of UDM and UDL; however, KDS is shared by all 
UDM. Our system currently has language interfaces for the 
relational model and SQL language, the hierarchical 
model and DL/I language, and the network model and 
COADYSYL-DML language. KDS is used to access the actual 
database and to manipulate the database by one of the 
language interfaces. All databases are stored in the 
form of the attribute-based data model (ABDM) and accessed 
by the attribute-based data language (ABDL) denoted by KDM 
and KDL, respectively, Figure 2 shows how UDMs and UDIs as 
separate entities access the same database structure. The 
present system started as a concept by Hsiao [Refs. 4,5] 
and reviewed by Rollins [Ref. 6]. Students at the Naval 
Postgraduate School have used papers and theses that provide 
guidance on the mappings of relational [Ref. 7], 
hierarchical [Ref. 8], network [Ref. 9] and functional 
(Ref. 10] databases to the attribute-based model. Students 
have mapped and implemented the following language 
interfaces: SQL-to-ABDL (Refs. 11,12], DL/I-to-ABDL 
[Refs. 13,14], CODASYL-DML-to-ABDL [Refs. 15,16]. The 


implementations have been modified by professionals and are 





Figure 2. Multiple Language Interface 


operational. Also, the language interface DAPLEX-to-ABDL 


has been designed [Ref. 17] but not implemented. 


C. THE MULTI-MODEL DATABASE SYSTEM (MMDS) 

The result of Zawis' thesis [Ref. 18] transformed MLDS 
into MMDS (Ref. 3]. His thesis provided a cross-model 
accessing capability to any relational user who wants to 
access a hierarchical database. Figure 3 shows the MMDS 
structure. The relational/SQL interface was modified by 


Zawis to include a hierarchical database type in the 







SCHEMA 
Transformer 






KDS 


Figure 3. The Multi-Model Database 


relational schema. It provides for identification of the 
data model of a database being accessed. This is needed 
Since the original relational/SQL interface is based on only 
one database model, i.e., relational. When a database is 
accessed the new interface will search for the schema of the 
database. When found the mapping schema will identify the 
database model. On the basis of the database model the 
interface branches to the appropriate procedures that allow 
manipulation of the database. This solution to cross-model 


accessing capability is effective in the current environment 


and will be expanded in future work. This process is a 
major step for mono-language users in a multi-language 
database environment. The multi-model database system is 


thoroughly discussed in [Ref. 18]. 


D. “THE MULTI-BACKEND DATABASE SYSTEM (MBDS) 

The multi-backend database system (MBDS) is not a factor 
in our research. However, it is a part of the system used 
to support our interface. We will briefly discuss the 
system to provide the reader with an overview of the total 
system. There exists more detailed discussion of MBDS in 
(Ref. 19] and [Ref. 20]. MBDS has provided a solution to a 
problem that exists in the conventional approach to database 
performance. As in Figure 4, each backend has, in addition 
to its own disk system, its own processor which are linked 
by a communications bus and controlled by a controller. 

This set up provides for parallelism which increases its 
performance gains and capacity growth. The gains and growth 


can be increased by the addition of more backends. 


E. THE THESIS ORGANIZATION 

We are adding the capability of using the relational 
language, SQL, to access and manipulate a network database. 
Prior to the thesis work, network databases are only 
accessible through the network language, i.e., CODASYL-DML. 
Our task is to design and implement the cross-model 


accessing capability between a relational language and 
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Figure 4. The Multi-Backend Database System 


network databases. 


We do not disrupt the existing cross- 


model accessing capability as implemented by Zawis. In 


Chapter II we discuss the data models with which we are 


concerned. Those are the attribute-based data model, 


the relation data model and the network data model. In 


Chapter III we discuss the strategies considered by Rodex 


and Zawis in implementing cross-model accessing 


capabilities. 


In Chapter IV we discuss our design and 


implementation on mapping a network database schema into a 


relational database schema. 


In Chapter V, we discuss our 


design and implementation of the network/SQL interface 
and modifications needed to support transaction and 
database integrity. In Chapter VI we give our conclusions 


and remarks. 


1O 


II. THE DATA MODELS 


In this chapter, we discuss the three data models and 
their corresponding data languages which are used in our 
research. Since the database being accessed is stored as an 
attribute-based database, Section A provides an overview of 
the attribute-based data model. Section B looks at the 


user's model which, in this case, is the relational data 


model. The final model covered in our research is the 
network data model. The distinction is that in our system 


we do not access a network database stored as a network 


database, but a network database that was transformed 


into an equivalent attribute-based database for storage. 


PP 


The CODASYL data manipulation language is not covered, since 
it is not relevant to our research. 
A. THE ATTRIBUTE-BASED DATA MODEL (ABDM) AND LANGUAGE 

(ABDL) 

The attribute-based data model (ABDM) originated in 
[Ref. 4]. ABDM was implemented into the kernal database 
system (KDS) discussed in Chapter I. The attribute-based 
data model has two types of data. They are the base data 
and the meta data; together they form the database. 

1. The Base Data 

A database is a collection of files. Every file has 


records. A record is a collection of attribute-value pairs 


TI 


(keywords) and the record body. The attribute-value pair is 
a member of a Cartesian product of the attribute set and the 
value domain of the attribute. Each attribute-value pair 
can only exist once in a record. No two attribute-value 
pairs in a record may have the same attribute. Directory 
keywords of a record are attribute-value pairs or attribute- 
value ranges that are stored in a directory. The attribute- 
value pairs in the records which are not kept in a directory 
are called non-directory keywords. The remainder of a 
record is textual information and comprises the record body. 
The following is an example of a record in the attribute- 
based data model. 

(<FILE,Suppliers>,<SNO,S1>,<SNAME,Jones>,<CITY,Monterey>, 

{Parts Supplied}) 
The parentheses enclose a record. The angle brackets 
enclose the attribute-valued pairs and the squiggly brackets 
constitute the record body. 
2. The Meta Data 
The meta data are stored information about the base 

data. More specifically, the directory is the collection of 
the meta data for a database. In the directory there are 
attributes, descriptors and clusters. Attributes are as 
presented previously. A descriptor describes the range of 
values or an exact value of an attribute. A cluster is a 
group of records that satisfy a unique set of descriptors. 


We use tables to maintain the directory. These are the 
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attribute table (AT), the descriptor-to-descriptor-id- table 
(DDIT) and the cluster definition table (CDT). These tables 
make up the directory, an example of which is shown in 
Figure 5. The attribute types are A, B and C. Type A are 
those with variable value- ranges. Type B are those with 
unique values. Type C have unique values as type B; 
however, the attribute value is entered at the record input 
times. Type-C attribute values are added to DDIT if they do 
not already exist: however, type A and B attribute values 
are fixed at the database creation time and will not change 
as new records are being inserted. 

3. The Attribute-Based Data Language (ABDL) 

A brief description of the five primary operations 
of ABDL is as follows: 

To insert a new record in a database, INSERT must 
proceed the record to be inserted, i.e., INSERT (record). 
An example of an insert is: 

INSERT (<FILE=Supplier>,<SNO=S1>,<SNAME=Woods>, 
<CITY=Monterey>) 

A deletion can affect more than one record. To 
specify the set of records to be deleted, we use a query, 
i.e., DELETE (query). Therefore, a delete is different from 
an insert. The former takes a query: the latter includes a 
record. 

As an example we can delete all suppliers from 


Monterey. 
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(b) A Descriptor-to-Descriptor-Id Table (DDIT). 
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Figure 5. The Directory Tables 
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DELETE ((FILE=Suppliers) and (CITY=Monterey) ) 

UPDATE is used to modify a record or a set of 
records ina database. The request consists of two parts: 
a query and the modifier. The query indicates which part of 
the database is to be modified and the modifier specifies 
how the database is to be updated, i.e., UPDATE( (Query) 
(Modifier)). 

The following is an example of an UPDATE request: 

UPDATE ((FILE=Suppliers) and (SNAME=Jones) (CITY=Carmel)) 

This UPDATE request will update the Supplier named 
Jones as being located in Carmel. 

RETRIEVE is used to gather information from a 
database. The request contains a query, a target-list and a 
by-clause. The query indicates which records are to be 
retrieved. The target-list is a list of attributes 
that are to be retrieved. The by-clause is optional and 
groups records, i.e., RETRIEVE( (Query) (Target-list) ) 
(by-clause). 

The following is an example of a RETRIEVE request: 

RETRIEVE (FILE=Suppliers) (SNAME) BY SNO 

This request will retrieve all the names of 
suppliers in the Supplier file in order of the SNO. 

RETRIEVE COMMON is used to merge two files into one 
where records from the respective files have common 


attribute values. The format of the request is as follows: 
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RETRIEVE (Query-1) (Target-list-1) 

COMMON (Attribute-1,Attribute-2) 

RETRIEVE (Query-2) (Target-list-2) 

Where the common attribute values are specified by 
their attributes, i.e., attribute-1 for the first RETRIEVE 
and attribute-2 for the second RETRIEVE. 

An example of a RETRIEVE COMMON query is as follows: 

RETRIEVE (FILE=Supplier) (CITY) 

COMMON (CITY, CITY) 

RETRIEVE (FILE=Part-location) (CITY) 

This query retrieves all suppliers and parts that 


are located in the same city. 


B. THE RELATIONAL MODEL AND LANGUAGE 
1. A Model Description 

The relational model was proposed by E.F. Codd in 
1970 [Ref. 21]. The model is a collection of tables that 
form a "flat" database, unlike the hierarchical and network 
databases that use a tree structure and a network, 
respectively. "An Introduction to Data Base Systems" by 
C.J. Date [Ref. 22] 1s recommended for further reading to 
provide an overview of relational database concepts. 

A relational database is a collection of tables or 
relations that are equivalent to files. Within each table 
there are tuples which are records of the table. A tuple 
consists of a group of attribute values and all tuples in 


the table have the same attributes. Therefore, specific 
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attribute names are the column headings of the table and 
the individual tuple are the rows. 

The tables (relations) are not connected by any 
structure. The tables are identified by their unique table 
names (relation names). The key attributes uniquely 
identify the tuples. Thus, within the table no two tuples 
may have identical values for the key attributes. 

2. The Data Manipulation Language (SQL) 

The relational data language supported by our system 
is SQL. This is a widely-used relational data language. A 
description of SQL is not provided, but examples of the four 
basic queries are given. A comprehensive discussion of SQL 
can be found in Date [Ref. 22]. 

SELECT is the command used to retrieve attribute 
values from the database. The query is structured so that a 
set of attributes is specified for a relation where the 
specific attribute values are to be selected. The relation 
name is in the FROM clause. The optional WHERE clause 
defines the attribute names to match in the relation. 

If a match exists the attribute value(s) in the SELECT 


clause are displayed. 


SELECT attribute (s) 
FROM relation 
WHERE query 


An example is as follows: 
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SELECT CIETY 
FROM Supplier 
WHERE SNAME= Jones 

This request selects the value(s) of the attribute 
CITY from the relation Supplier where the SNAME is Jones. 

The select command can be more complex and a select 
command from more than one relation is also possible. Our 
system at this time provides for up to two relations. 

The INSERT request inserts a new tuple into an 
existing table. In this situation the command inserts 
attributes with values into a relation. The attribute order 
and number of attributes must match the relation exactly. 

INSERT INTO Relation (attribute-names) 

<attribute-values> 

An example of an INSERT follows: 

INSERT INTO Supplier (SNO,SNAME,CITY): 

<'S1', 'Jones', 'Monterey'> 

This command inserts S1,Jones and Monterey into the 
Supplier relation. 

The DELETE command will remove one or more a tuples 
from a relation in a database. In the delete command 
structure all tuples with occurrences in a relation where 
the query is matched, will be deleted. 

DELETE relation 


WHERE query 
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An example of a DELETE command follows: 
DELETE Supplier 
WHERE SNO='S1' 
This DELETE command removes all tuples where SNO is 
equal to S1 in the Supplier relation. 
An UPDATE command is used to modify attribute 
values in one or more tuples. The structure of the UPDATE 
command is that a relation is updated in the modifier 


attribute with the new value where the query is true. 


UPDATE relation 
SET modifier 
WHERE query 


An example of an UPDATE command is: 


UPDATE Supplier 
SET CITY=Memphis 
WHERE SNO=S1 


This UPDATE command replaces the CITY attribute 
value with Memphis in the Supplier relation where SNO is 


equal to Si. 


C. THE NETWORK DATA MODEL AND LANGUAGE 

IHE NGIW A MODEL 

The network (CODASYL) data model is based on the concept 
of directed graphs. Graphs consist of nodes and arcs. 
Data Base Management Systems by Cardenas (Ref. 23] has an 
excellent introduction to the network (CODASYL) schema and 
architecture. The original design and implementation of a 


network database employed the most restrictive options. The 
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options involve insertion, retention and set-selection. The 
structure of a network database as it is realized in the 
attribute-based form is well described by Wortherely [Ref. 
15]. Our description follows his and Rodeck's [Ref. 10] 
work to maintain consistency. 
1. A Model Description 

The network (CODASYL) databases are networks of 
record types and set types, where records and sets are the 
entities which describe the databases. A record type in a 
CODASYL database is defined as a collection of 
hierarchically-related data item names or field names. A 
record is any occurrence of a record type and has specific 
values assigned to the data items named in the schema 
declarations. This implies that a record type is simply a 
generic name for all of the records that are described by 
the same schema. Set types in a CODASYL database indicate 
relationships between record types. They consist of a 
single record type called the owner record type, and zero or 
more record types called the member record types. Thus, a 
set type expresses explicit assoclations between different 
record types in the database. This characteristic makes it 
possible for a designer to model a large variety of 
real-world database management problems involving diverse 
record types [Ref. 15]. The many-to-many relationship is 
limited in that an owner record of a set type cannot be a 


member of the same set type. 
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Set types have occurrences just as record types do. 
Each occurrence of a set type has one occurrence of the 
owner record type and zero or more occurrences of each its 
member record types. The same restriction applies here in 
that a record occurrence cannot be present in two different 
occurrences of the same set type. This restriction 
emphasizes the pairwise disjointness of the set occurrences 
of a given set type [Ref. 15]. 

2. The Data Manipulation language (CODASYL) 

The CODASYL language is used to create a network 
database. However, this thesis is concerned only with 
accessing a network (CODASYL) database and not in the 
language that creates it, but in the relational language 
SQL. We therefore do not include a discussion of the 
CODASYL manipulation language. However, the book The 
Codasyl Approach to Data Base Management by T. William Olle 
[Ref. 24] provides a very comprehensive look at the CODASYL 
manipulation language. 

3. The AB(Network) Database 
In our implementation, a network database is stored 


aaa AA 


in the ABDM form. Thus, the database looks like an 
HE mg 





a a Taire 


attribute-based database. What distinguishes a network 


“database, say, from a relational database which is also 


en 


stored in the ABDM form is the presence of a network schema 
Tor the database. Similiarly, there is a relational schema 


for a relational database. To characterize our approach to 


21 


the support of network databases via ABDM storages and 
network schema, we refer to our network databases, the 


AB(network) databases. 
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III. THE CROSS-MODEL ACCESSING CAPABILITY 


A. THREE APPROACHES TO THE CAPABILITIES 

As more databases proliferate with their associated 
data manipulation languages, the need for a more flexible 
database system is needed. MLDS discussed earlier offers 
more flexibility than conventional database systems by 
giving the user the capabilities to access heterogeneous 
databases based on different data models with the user's 
familiar data manipulation language. The solution is to 
have: (1) the capability to access any database with a 
generic data manipulation language, (2) the translation of 
the user's data language into the kernel data language, (3) 
the presence of a database schema for the database which is 
based on the user's familiar data model. This process is 
the concept of the Multi-Model Database System (MMDS). The 
scope of this thesis is specifically concerned with giving 
the user the ability to access and query a network database 
via SQL (a relational data language). In this case the 
user's familiar data model is relational (not network) and 
the user's familiar data language is SQL (relational, not 
Codasyl1-DML). Nevertheless, we provide the user with the 
capability to access a network database as if it is a 


relational one. 
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Rodeck [Ref. 10] described various strategies for 
implementing MMDS. A summary of the proposed strategies is 
presented in the rest of this chapter. 

l. The High-Level Preprocessing Method 

This method is called high-level preprocessing 
because the processing occurs "above" the local interface. 
This means a user inputs a database name for processing. 
The local interface is searched for the database name, if 
not found locally, the other LIS are searched. When found 
in another LI, the schema transformer transforms the found 
schema into an equivalent local database schema based on 
the local data model. When the user queries the database 
via the transformed schema with the local data language 
(say SQL), the second component of LI, language translator, 
translates the queries (say, Codasyl1-DML) into the 
equivalent queries in the local data language for accessing 
the database. The third component of LI is the result 
formatter. It formats the results of a query into a form 
the user can recognize. For example, if a user wanted to 
access a network database via SQL transactions, the results 
would be returned to the user ina table form, instead of 
the network form. Figure 6 depicts this processing 
strategy. 

2. The Mixed-Processing Strategy 

This method of processing differs from the previous 


method in that there is no language-translation step. There 
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Figure 6. The High-Level Preprocessing Strategy 


are two components involved in the process, the schema 
transformer and a second language interface. See Figure 7. 
Similar to the preprocessing strategy, when a user inputs a 
database name for processing, the local LI is searched; if 
the database is not found, other LIs are checked. After the 
database is found in another LI, a copy of its schema is 
transformed into an equivalent schema and placed in the 
local LI. When queries are entered against the transformed 
schema in the local data language, the local LI processes 


the request without the need of any translation. The 
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Figure 7. The Mixed-Processing Strategy 


output also requires no reformatting, since it is in the 
form of the local database model. 

3. The Postprocessing Strategy 

The last strategy to be examined is the 

postprocessing strategy. In this strategy, the schema 
transformation takes place from the schema of the database 
to be accessed to the schema of the local LI, i.e., 
transforming a schema of a heterogeneous database into a 
schema in the form of the user's familiar data model. This 


strategy is called low-level because it occurs below the LI 
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layer as illustrated in Figure 8. The language-translation 
portion of the strategy takes place in exactly the opposite 
direction to the schema transformation. The translation 
takes place from the local database language transactions 
to the equivalent database language transactions of the 
heterogeneous database. The result formatter outputs the 
results in the form of the local II, i.e., use the 


transformed schema for output formats. 


B. THE CHOSEN APPROACH 

Because the postprocessing strategy involves both the 
schema transformation and transaction translation as 
discussed in Zawis [Ref. 18], it is too complicated to be 
used in our implementation. 

The preprocessing methods involve translating the syntax 
of one data language into the syntax of another language; 
this is also a very complicated and time-consuming process. 
It is also ruled out as a viable strategy. 

Thus, the mixed processing is chosen, since it does not 
require language translations for the same transaction. The 
last characteristic of the mixed-processing strategy is that 
this strategy requires less modification of existing code 
than the other strategies. The existing mixed-processing 
strategy for the network-to-relational transformation is 
very similar to the hierarchical-to-relational transforma- 
tion. The major difference is the manner in which records 


are searched. In the hierarchal model, the user is 


2 


LI LI} 


Schema 
Transformer 








Language 








Translator 





Results 
Reformatter 





Figure 8. The Postprocessing Strategy 


concerned with a single parent-child relationship. In the 
network model, the user has to search via sets and must also 
be concerned with records having multiple parents. However, 
once the desired record is found, the manner of retrievals 
is very similar. We can refer to Zawis' [Ref. 18] work on 
the hierarchical-to-relational for our work on the network- 


to-relational transformation. 
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IV. ON TRANSFORMING A NETWORK SCHEMA TO ITS 
RELATIONAL EQUIVALENT 
A. THE DESIGN 

The first step in the mixed-processing strategy is to 
perform the schema transformation. In our case, it is from 
a network schema to an equivalent relational schema. This 
1s accomplished by translating the data relations in the 
network database to their equivalents in the relational 
model. 

The network-to-relational transformation process will be 
illustrated by first describing a typical network database. 
Figure 9 illustrates the sample B be transformed. 
The schema that describes the network database is "sps." 


ure 10 depicts this database definition. There are three 


record types called SA (Supplier), PA (Parts), SP (Supply). 





These record types will be functionally represented in a 
relational schema by tables. The "duplicates-are-not- 
allowed" declarations for SNO and PNO in respective SA and 
PA record types implies these attributes are key fields that 
uniquely describe an entity or record in question. 
Therefore, when desiring to insert, the program must check 
to see if the insert request has an attribute value that 
already exists in the database. The record attributes are 


defined by type and length. 
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SPS DATABASE 
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Figure 9. Supplier/Parts Network Database 


The set types are now defined. Their purpose is to 
describe a relationship among record types. The two set 
types defined are SSP and PSP. Each set-type declaration 
Will include the following: owner-record-type name, member- 
record-type name and insertion and retention rules. The 


particular details of each set type will differ, depending 
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schema name is SPS; 
record name is SA; 
duplicates are not allowed for SNO; 
SNO ; character 10. 
SNAME ; character 10. 
record name is PA; 
duplicates are not allowed for PNO; 
PNO ; character 10; 
CITY ; character 10; 
record name is SP; 
QTY ; fixed 4. 
set name is SSP; 
owner is SA; 
member is sp; 
insertion is automatic 
retention is fixed; 
set selection is by value of SNO in SA; 
set name is psp; 
owner is PA; 
member is SP; 
insertion is automatic 
retention is fixed; 
set selection is by value of PNO in PA; 


Figure 10. The Network Database Definition 
of a Sample Database 


31 


on the circumstances. The owner-name and member-name 
statements simply define a static relationship among 
existing records (i.e., occurrences) of the two record 
types. 


The statement, insertion-is-automatic in set types SSP 


and PSP, means every record added or modified which 
represents a record type or subtype, must belong to a 
particular set. The statement, retention-is-fixed, 
requires a member record reflecting that a record subtype 
always belongs to the same owner-record type. 

The last statement, set-selection-is-by-value, declares 
that when a record is inserted into a set, the set must be 
the current set type of SNO in SA and likewise PNO in PA. 
In simpler terms, this means each supplier and part will be 
inserted in the sets based on the owner record types. 

In transforming an existing network schema to a 


functional equivalent relational schema, various key issues 


must be observed. The relational database model has the 
characteristic commonly referred to as flatness. Flatness 
means that the tables (1.e., record types) have no 
(structural) relationships from each other. Whereas ina 
network or hierarchal database, the record and segment 
types are structurally linked. In the relational database 
environment, if structural relationships between tables are 
desired, data manipulation constructs such as the JOIN and 


VIEW are used. 
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As Zawis [Ref. 18] indicated in his work on the cross- 
model transformation (e.g., RELATIONAL-TO-HIERARCHICAL), 
there are key issues that also must be adhered to, in order 
to preserve the structural integrity of the mapped (i.e., 
in our case the network) schema to the equivalent 
(relational) schema. 

One key issue is in the case of the network schema 
maintaining owner-set-member relationship in the network 
database. Similar to Zawis [Ref. 18] proposal for 
performing his transformation of hierarchal-to-relational, 
there are two methods that must be examined. His first 
method of schema transformation is to create a relational 
table for each relationship required in the given database. 
The fact that a network database has many-to-many 
relationships among its records; i.e., a record can have two 
owners and these owners can have parents, the proliferation 
of tables would make the representation not very cost 
effective. The second reason his method is not a sound 
method for our purposes is due to the numerous tree 
structures (hierarchies) in a network database, thus making 
queries against the database a very long and complicated 
process. 

In our design, the method of schema transformation is to 
cascade data in key fields in the network records to form 
primary keys of equivalent relational tables. Figure 11 


illustrates the cascading method. Key fields are defined as 
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fields that uniquely identify the corresponding records. 
They must remain consistent throughout the transformation. 
Figure 12 is an illustration of the transformed network 


database to its equivalent relational database. 


database name = SPS 


number of relations = 3, number of views = O 
database type = NETWORK 
relation name = SA, number of attributes = 3 
attr name = SNO , ctype = s, length = 10, Key = TRUE 
attr name = SNAME ¿type = s, length = 10, key = FALSE 
attr name = CITY ¿type = s, length = 10, key = FALSE 
relation name = PA, number of attributes = 3 
attr name = PNO ¿type = s, length = 10, key = TRUE 
attr name = SNO ¿type = Ss, length = 10, key = TRUE 
attr name = QTY ‚type = s length = 4, key = FALSE 
relation name = SP, number of attributes = 3 
attr name = PNO ,ctype = s, length = 10, key = TRUE 
attr name = SNO ¿type = s, length = 10, key = TRUE 
attr name = QTY ¿type = 1, length = 4 , key = FALSE 


Figure 12. The SPS Database Mapped from a 
Network Schema to an Equivalent 
Relational Schema 
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B. AN IMPLEMENTATION 

The implementation of the mixed-processing strategy 
required modification to the language interface layer (LIL), 
the kernel mapping system (KMS) and the kernel controller 
(KC). These modifications are made with little modifica- 
tions of the existing Relational/SQL interface. This 
chapter will give a sumary of the major data structures 
implemented in the new LIL as well as the flow of 
executions from LIL to the parser for the syntactical 
verification and execution of a SQL transaction. 

1. The Lanquaqe-Interface (LI) Structures 

The language interface layer (LIL) is one of the 

most important layers in MMDS mainly because this layer 
directly links the user to the system. Upon an initial 
sign-on onto the system, there are numerous data structures 
that must be initiated in order to give the user A 
the multiple databases. All the data structures in LIL 
will not be examined herein; however, the most important . 
structures will be presented. The first structure present 
is the dbid_node in Figure 13. This structure points to a 
list of all the schemas that have been defined for all 
modeled databases. The structure also gives the user the 
ability to access all of the databases databases in the 
MMDS environment. After the user inputs a database name, 
the system searches the current list of databases based on 


the requested interface first. For example, if the user has 
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union dbid node p | 
( 1 4 
struct rel dbid node “rel, 
struct hie dbid node *hie; 
struct net dbid node *net; 
struct ent dbid node kent, 


) 


Figure 13. The dbid node Structure 


requested a relational interface and then inputs a database 


name to be loaded onto the meta-data disk, the system would 


a 


first search the list of relational database names Via-ithe 


dbid node structure. If the database name is not found in 


BANG... 
the list of relational database names, it then searches the 
other defined database names. 
The next data structure is the rel dbid node which 
eee 


points to the first relational database. This structure is 


depicted in Figure 14. It is | the T data structure 


for all the schemas Ae aoe for the Ea databases. 


ARA ar 
d 


The structure contains the name of the database, the number 
of relations, a pointer to the first relation, a pointer to 
the current relation, a pointer to the next database schema 
and, based on Zawis' first implementation of cross modeling, 
a DBTYPE. The DBTYPE tells the user the name of the 
original database prior to the transformation. 
The data structure rel node shown in Figure 15 

describes each of the A e a database and is 


initialized with information available from the equivalent 


record data structures in the network schema. The relation 
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struct rel dbid node 


{ 


char name[DBNLength + 1]; 
te num_rel; 

struct rel node *first rel; 

struct rel node #GUET reir 

struct rel dbid node #next db, 

int dbtype 


} 


Figure 14. The rel dbid node Structure 


struct rel node 


( 


char name(RNLength + 1]; 
int num_attr; 

struct rattr node *first aten 

struct ratty node *xcūrr attri; 

struct rel_node *next irel 


} 


Figure 15. The rel node Structure 


name is set equal to the network record name and pointers 
are set to the first attribute of the relation and to the 
next relation, if any, in the schema. The network records 
are mapped to this data structure via the set-member 
relationship. 

The user info data structure uniquely identifies a 
particular relational user. It also links the user to the 
linked list of other users on the multi-user environment. 
Figure 16 depicts this data structure. 

The last data structure examined is the sql info 
data structure. It contains pertinent information about the 


current database that the user is using, and information 
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struct user info 


( 


char uid[UlDLength + 1]: 
union li_info li type: 
struct user info *next user, 


) 


Figure 16. The user info Structure 


about the sql transaction. This structure is shown in 


Figure 17. 


struct sql info 


( 


struct curr db"info CUETITOD, 
struct file info file: 
struct ran info sql tran: 
int operation; 
struct adil info *ddl files; 
struct tran info *abdl tran; 
struct kms info kms data: 
union kfs info kfs data: 
union kc_info kc_data; 
int error; 


) 


Figure 17. The sql info Structure 


2. The Schema Transformation 
As previously mentioned, LIL is the most important 
layer in MBDS. It is from this layer the user logs onto the 
system and tells MBDS what type of tasks to accomplish. The 
user can load new databases, access previously created 
databases, and access information from an existing database. 
The flow of control is sequential. Control always returns 


to the menu-driven LIL. The user will always exit the 
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system via the top level menu. The user can step back to 
top-level menu because an exit routine is provided at each 
level of menus. 

The first menu, a user will see, gives the user 
options to load a new database, process an old database or 
return to the operating system. If he chooses to load new 
data into an existing database, the list of relational 
schemas is searched for the appropriate name. If the 
database name is found, the schema is loaded and processing 
may take place. 

If the database name is not founded by searching the 
list of relational schemas, the system will search the list 
of schemas defined as network, hierarchal or functional. If 
found, it transforms the found schema to a functionally 
equivalent relational schema to facilitate SQL transactions. 
If the name of the selected database is located in the list 
of network databases, the data structure rel dbid node is 
appended to the end of the list of relational databases with 
the DBTYPE field having the value NET. The new data 
structure, rel node, is now attached to the schema. The 
relational name of the SOL transaction is compared to the 
record name in a set type. If a match is attained, then the 
relation name in the SQL schema is equivalent to an owner- 
record name declared in a set type. If the relation is not 
found in the set type, then a comparison would be made on a 


member record. When found, the network record is set to 
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equal to the SQL record and pointers are set to point the 
first attribute of the SOL record and the next relation. 

The rattr node data structure describes all the 
attributes associated with a relation. Figure 18 depicts 
this structure. Each attribute is represented by a unique 
rattr node which contains a name, type, and length. The 
attributes are mapped directly from the network attribute 
node to the relational attribute node (struct rattr node). 
If the network attribute is a key field, then the attribute 
is flagged in the relational schema with key attribute 


having a value of "true." 


struct rattr node 


( 


char name[ANLength + 1); 
char type; 

int length; 

int key flag; 

struct rattr node *next; 


) 


Figure 18. The rattr node Structure 


The cascading mechanism is used to map the key from 
the root nodes to the leaf nodes in the network structure. 
This is necessary in order to maintain the integrity of the 
network database. Even though relational schemas do not 
recognize networks set-owner-member relationships, cascading 
the key fields into the relations by the convention of key 


attributes captures this relationship. Upon completion of 
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the cascading sequence, the number of attributes is set to 
equal to the number of attributes in the associated network 
database plus the cascaded fields. The mapping and 
cascading continues until all the relations are completed. 
Upon completion, control is returned to LIL where the user 


can query the database via SQL transactions. 
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V. MAPPING SOL STATEMENTS TO ABDL STATEMENTS 
FOR ACCESSING A NETWORK DATABASE 


In Chapter IV we have discussed the schema transforma- 
tion process needed to implement the mixed-process strategy. 
Figure 19 depicts the scheme used to complete the cross- 
model accessing of an AB(network) database. The forms in 
solid lines represent existing implementation and those in 
broken lines represent our work. The relational schema 
presented to the user is transparent in that the network 
schema appears to be a schema for a relational database. 
When a user creates an SQL transaction to access or update 
an AB(network) database, the existing Relational/SQL 
interface cannot be used, since it translates the SOL 
transaction into an ABDL equivalent for an AB(relational) 
database. Ours is an AB(network) database. Thus, the new 
network/SQL interface uses the original relational/SQL 
interface with some modifications which allow the interface 
to identify the schema created for the AB(network) 
database, instead of the ones for the AB(relational) 
databases. Note that the entire process does not involve 


the the network/CODASYL interface at all. 


A. THE TRANSLATION PROCESS IN LI 
Zawis [Ref. 18] presented two methods that the new 


language interface (LI) can be implemented. The first 
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Figure 19. The Cross-Accessing Language Interface Design 


method is to created a separate language interface 
(LIL,KMS,KC and KFS) for each data model. The LI used would 
be determined by the database model selected. The second 
method is to use the existing LI and modify it to by the 
user input. The second method is to use the existing LI and 
modify it to branch to the appropriate translation and 
processing activities as required by the user input. The 
latter method requires less implementation. Zawis chose to 
modify the existing LI to reduce the code size required. 
Similarly, we choose the same path. The consistency in 
improvements to MDLS and the existence of previous work are 


primary factors in determining the chosen method. 
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The SQL statement is sent to the relational/SQL 
interface. The relational/SQL interface branches to an 
internal network/SQL interface that ensures the needed 
checks and modifications of a network database are performed 
in KMS. 

1. Query Processing in the LI 

The user will load a database or select an existing 
database to process. The user is then given a menu to 
choose follow-up actions: 

Enter mode of input desired 

(f)--read in a group of transactions from a file 

(t)--read in transactions from the terminal 


(x)--return to the previous menu 


ACTION ----> _ 
The user can use a prepared list of queries or create 
queries from the terminal. In any case a list of 
transactions are displayed on the terminal, with a number 
associated with each query. The following action menu is 
presented to the user: 
Pick the number or letter of the action desired 
(num) --~execute one of the preceding queries 
(d)--redisplay the list of queries 


(x)--return to the previous menu 


ACTION ----5 _ 
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The user can now select a query to process. The order of 
processing is not important if the query does not rely on 
another query. As an example a user cannot retrieve a file 
if it has not been created first. The query is sent to the 
kernel mapping system (KMS) for translation, and then to the 
kernel controller (KC) for execution. The results are 
returned to the user on the terminal and the action menu is 
presented for user selection. 

2. Query Processing in LI 

Previous theses provide descriptions of the query 
process in detail. In Benson and Wentz [Ref. 14], Emdi 
[Ref. 16] and Kloepping and Mack [Ref. 12] detailed 
descriptions of KMS are presented. The following is an 
overview of the query process based on the designers and 
implementors of MLDS. Zawis [Ref. 18] composed an excellent 
overview and we follow that overview. 

SQL transactions are sent to KMS from LIL. The KMS 
function is to: (1) parse the SQL query and ensure the 
query syntax is correct, and (2) translate the query into an 
equivalent ABDL transaction. A valid query is sent to the 
kernel controller for processing by the kernal database 
system KDS, i.e., MBDS. 

The KMS parser uses the Yet-Another Compiler 
Compiler (YACC). YACC is a program generator that performs 
a process on a stream of tokens that produces a parser that 


is syntactically correct. The Yacc-produced parser is a 
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finite-state machine and performs a top-to-bottom parsing. 
The parser searches from left to right and with one token 
look-ahead. When tokens are recognized, portions of the 
output code may be executed or propagated up the hierarchy 
until a higher-level rule is satisfied. If the token string 
has successfully been processed then the parser terminates 
normally. If a syntax error is issued, the parser returns 
to the calling procedure. 

The KMS data structure primarily consist of five 
data structures used during the parser process. The 
rel kms info is depicted in Figure 20. This structure 
contains information accumulated in the parser process to be 
used later. Attribute names used in the Select and Insert 
operations are held in the target list, the names of the 
relations being accessed are stored in templates, and Insert 
request attribute values are maintained in the insert list. 
The temp str stores intermediate translation results and the 
join str is needed to hold the translation of a second 
retrieve request of a join operation. The next nest field 
is a pointer to the next rel kms info structure in the list 
of a nested Select transaction. The last field, alt tgt 
holds information relating to the translation of non 
AB(relational) statements. 

The four data structures pointed to by rel kms info 


are depicted in Figure 21. The relational KMS data 
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struct rel kms info 


( 


struct target _ list_info #fITST tots 
struct _ Gtemplates_info templates; 
struct insert list info xfirst val, 
char *temp str; 
char *jJoin str; 
struct rel kms info *next-nest; 
struct alt_list_info alce 


) 


Figure 20. The rel kms info Data Structure 


structures are covered extensively in Kloepping and Mack 
(Ref. 12]. Since we are accessing an AB (network) database 
the modifications are primarily concerned with the KMS 
parser. The branching in KMS is modified so that the 
SQL-ABDL AB (network) model. “The remainder of the chapter 
describes the design considerations and implementation 
details involved in mapping the four primary SQL 


transactions into AB (network) equivalents. 


B. THE SELECT STATEMENT 


1. The Design 
The SOL select command retrieves information from a 
database. Retrieval of information does not alter the 


database. Information is retrieved without modification, 
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target list info 

( 
char 
char 
struct target_list_info 
} 

templates info 
( 
char 
char 
) 

insert list info 
( 
char 
struct insert into info 
) 

eee list info 
{ 
char 
char 


struct alt list info 


) 


name (ANLength + 1); 
tgt_rel(RNLength +1); 


*xnext_attr;}; 


namel (RNLength + 1); 


name2 (RNLength + 1); 


*value; 


*next val; 


name(ANLength + 1); 
op(RNLength + 1); 


*next attr; 


Figure 21. KMS Parser Data Structures 


when accessing an AB (network) database. The transforma- 


tion of the AB (network) schema into an equivalent AB 
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(relational) schema provides the user with a view of the 
attributes required to retrieve information, the key 
attributes are cascaded in the mapping. The DBKEY used in 
the CODASYL language interface is system generated 
and not provided to the user and therefore the DBKEY for the 
record must be added to the retrieve request before it is 
sent to the MBDS. The modification is not needed at root 
records, since theydo not have owner records and the 
relational user is given the actual attributes of the 
record. 
2. An Implementation 

Database integrity is not an issue in the select 
statement. The method used to implement the select is to 
branch to a retrieve net procedure in KMS and add the record 
DBKEY to the ABDL request and send the request to MBDS. 
Figure 22 is an example of a select transaction and its 


equivalent ABDL transaction. 


SELECT sno,sname 
From sa 
WHERE city = 'London' 
Retrieve((TEMP = SA) and (CITY= London)) (SNO,SNAME) BY 


DBKEY 


Figure 22. A SQL Select Transaction 
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C. THE INSERT STATEMENT 
The SQL insert statement is used to add information to 
an existing database. The database is modified and the AB 
(network) database integrity may be violated by the insert 
statement. The record-type-and-set-type relationships have 
to be maintained. The relational user does not know the 
notion of record types and set types. Nor does the 
relational user know their relationships and the 
restrictions that apply. 
1. The Design 

As mentioned in Chapter II the implementation of the 
AB (network) model is restricted to certain options. Here, 
the type of insertion is automatic in the AB (network) 
model. This means that insertion is based on set selection 
criterion. That is an occurrence of a record may not be 
inserted into a member record if a set type does not exist 
for the occurrence. The network/SQL language interface 
is needed to determine if the insert statement will violate 
the AB (network) integrity. The interface must retrieve the 
record DBKEY and add it to the insert statement before 
sending the insert statement to MBDS. The SQL user inserts 
the following transaction: 

INSERT INTO sp (sno,pno,qty): 

<1S1','P1',300> 

The AB(network) model only allows for automatic 


insertion. If the insert is into a member record proper 
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caution is taken to ensure that the record is placed in the 
proper set occurrence. The set selection mode must be 
considered. 

a. The STORE-by-Application Statement 

This method looks for proper set occurrence and 
then inserts the record. 

b. The STORE-by-Value Statement 

This method adds the requirement that the owner 
of the proper set occurrence must be located prior to 
insertion. 

C. The STORE-by-Structure Statement 

This method is similar to store-by-value except 
that the values of the owner and member attributes must 
match. 

KMS only allows for the store-by-application 
method of set selection to be used in the present 
implementation. 

2. An Implementation 

By design, the AB(network) database accepts inserts 

if all set occurrences are proper. The insert rel to net 
procedure traverses the AB(network) database to determine if 
the insert is valid. The first step is to evaluate whether 
the record type is a root recora(owner only) or member 
record (can be owner also). This is accomplished by 
searching the set types using member name to match the 


record type. If no match occurs then the record type does 
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not exist or it is a root record. In the latter case, the 
insert statement is then processed. Conversely, if a record 
type is found, it is a member. The insert must match 

the owner(s)' set type(s). This requires that a search of 
the owner(s) record(s) of the member record have the key 
attribute(s) used in the insert statement in the owner(s) 
record. This is accomplished by a RETRIEVE request for each 
owner record based on the attribute on which the set type is 
based. Each RETRIEVE is sent to a buffer to be sent to 
MBDS. 

When the AB (network) translation is complete the 
parser completes its operations and control is given back to 
LIL. The KC then receives the linked list of ABDL requests 
from LIL. KC recognizes that an AB(network) database is 
being accessed and branches to a procedure that passes the 
RETRIEVE requests to MBDS. The record retrieved, if 
any, is sent to a buffer. The buffer is checked for at 
least one record. If the buffer is not empty the insert 
transaction is transmitted to the KC for processing. If the 
buffer is empty the user is informed that the insert as 
requested will violate the AB(network) database. 

The following is a sample of the terminal display that the 
SQL user receives. 
UNABLE TO COMPLY WITH REQUEST--to insert a network 


member record, an owner record must exist. 
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D. THE DELETE STATEMENT 
1. The Design 

The purpose of the SQL Delete is to delete records 
from a relational database. However, deleting records from 
a network database involves more than just deleting records. 
First, a database modification (in our case, deletion) 
requires checking to ensure that the network integrity is 
maintained. This process involves checking whether a target 
record iS a parent record, if it is a parent, all the 
children must also be deleted. Second, because a network 
database has many-to-many relationships between parent-child 
records; it necessary to update (i.e., delete) all the 
associated occurrences in subsequent tree structures. 
Third, translate the SQL Delete into a number of AB(network) 
Deletes. For example, suppose a user performs the following 
Delete transaction on the network database in Figure 23: 


DELETE SA 
WHERE SNAME = '‘'IBM' 


If the supplier record 'IBM' is deleted from SA, the 
occurrence of 'SS2' in SP no longer has an associated parent 
in SA (i.e., integrity violation). In our design, we will 
delete the specified record and all associated occurrences. 
The primary tasks in executing a SQL Delete 
transaction are to provide integrity checks on the network 
database, translate the SQL Delete to an equivalent set of 
AB(network) Deletes and finally perform the delete 


transaction. 
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SPS DATABASE 


PA 
pao ger 


SANJ ET BOLT | SANJ 


Figure 23. A Sample Network Database Prior 
to a Delete Operation 


2. An Implementation 


AS described earlier, prior to executing a SQL 


(delete), a sequence of events must be accomplished. One of 


these events is integrity checking. Integrity checking 


consists of multiple deletes on parent-child related 


records. These multiple deletes are accomplished by 
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buffering of intermediate results associated with records at 
different levels in the network tree structure. 

The existing hierarchical language interface, 
contains the necessary logical concept to perform the Delete 
operation in our network database. This interface will 
allow us not to duplicate and integrate different code 
(i.e., mixed-processing strategy). 

The operations necessary to accomplish a Delete 
transaction is dependent on the location of the occurrence 
in the network tree structure. If the occurrence is located 
at a leaf node in the network structure, then only a single 
delete transaction is needed. However, if the occurrence is 
located in a non-leaf node position in the network tree 
structure, multiple Retrieve and Delete transactions are 
needed. 

The Delete operation requires multiple retrieves 
because the user does not provide all the necessary 
information required for the operation. As a relational 
user, the user does not know what records are assoclated 
with each other (in given many-to-many relationships) in 
the network structure. We will use the Retrieve operation 
to gather all the records associated with the target delete 
record. The retrieves take place at each level of the 
network structure: the results are then stored in the KC 
(Kernel Controller) buffer for later processing. An example 


of the transactions required for the Delete operation 
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discussed earlier is illustrated in Figure 20. The first 
value retrieved is <SS2,IBM,SANJ>. The key value, 'SS2' is 
then used to delete occurrences in SP. The Delete operation 
is now complete. This is a very simple example of a network 
database in order to keep the details of the implication to 
a minimum. However, in a more complicated network 
database, execution of the Delete operation would continue 
to sibling and child records of SP. A Retrieve operation 
is needed at each level utilizing the 'SS2' value from 
the previous Retrieve. Following execution of these Delete 
operations in SA, the processing returns recursively to the 
associated record buffers with the specified 'SS2' value. 
Upon completion of the branch, execution then goes to the 
next branch for retrieval and Deletion using 'SS2' value 
until no branches in the tree remain for processing. The 
deletes are then sent to MBDS for execution. 

The execution of the Delete operation starts in the 
KMS parser. The user specified Delete transaction is parsed 
and verified to be a legal Delete operation. A legal Delete 
operation is an operation that the parser recognizes based 
on existing grammar rules in the parser. Once the parser 
verifies that the operation is a Delete on a Network 
database, execution then branches to a routine that 
converts the AB(relational) Delete to an equivalent 


AB(network) transaction. 
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The remaining AB(network) Delete transaction now 
needs data structures in order to create and execute the 
Delete operation. These structures are built from the 
network database schema. The transaction now is complete 
and is sent to the KC for execution. 

Upon completion of the Delete, the network data 
structures are released and the allocated memory returned to 
the operating system. The KMS then resumes processing and 
the relational data structures are re-initialized. In 
completion, control is returned to LIL for input from the 


user. 


[RETRIEVE ((TEMP = SA) AND (SNAME = IBM)) (SNO) BY SNO] 


[DELETE ((TEMP = SA) AND (SNO = *x))] 


[RETRIEVE ((TEMP = SP) AND (SNO = **)) (SNO) BY SNO] 


[DELETE ( (TEMP = SP) AND (SNO = **) AND (PNO =*%)) ] 


** is the place-holder for the value of SNO supplied by the 
prior retrieve statement. 


Figure 24. A Sample AB(network) Delete Transaction 
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E. THE UPDATE STATEMENT 

The SQL Update operation is used to modify attribute 
values ina relational database. If multiple values are to 
be updated, a sequence of Update transactions must be sent 
to MBDS. The Modify statement in Codasyl-DML is the 
equivalent to this (relational) statement. 

1. The Design 

The Update transaction is limited to non-key 

attributes. Non-key attributes are the attributes not 
needed to maintain the integrity of the network database. 
As a result, modification of key attributes values will 
cause corruption of the network database integrity. For 
example, if a key attribute field is changed in a record 
with children, the ancestor tree associated with the new 
value will be incomplete, and the existing children are no 
longer linked to a valid parent in the network structure. 
As documented in earlier work, our implementation of the 
Update operation remains consistent with this constraint. 

2. An Implementation 

The fact that we are constrained to only updating 

non-key attributes, our Update translation is achieved 
within the relational interface. LIL forwards the 
transaction to KMS for parsing and syntax verification. 
After the parser recognizes the transaction as a legal 
Update on a network database, a search routine is called to 


search the database schema for the desired attribute. If 
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the target attribute is a key attribute, then the user will 
be sent the following message: 

UPDATE not allowed. Updates only allowed 

on NONE KEY attributes only. 
The request is aborted sana control is returned to LIL for 
further user input. If the attribute found is a non-key 
field, then the Update is mapped to an equivalent 
AB(network) Update and passed to KC for execution. 

In terms of future work, the Update could be 
modified to allow updates on key-attributes. This would 
consist of a series of Retrieve and Update operations 
similar to the Delete statement. Update statements could 
then be generated that will modify the cascaded key- 
attribute values of all descendent records in the network 
tree. In addition, it will be necessary to execute a 
retrieve on the ancestors of the record to be updated. The 
returned records are then stored in the KC buffer, thus, 
the complete ancestor tree for the new attribute value is 


established. 
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VI. CONCLUSIONS 


Traditionally, the design and implementation of a 
conventional database system begins with the selection of a 
data model, followed by the specification of a model-based 
data language. An alternative to this traditional approach 
to database system development is the multi-lingual database 
system (MLDS). This alternative approach affords the user 
the ability to access and manage a large collection of 
databases via several data models and their corresponding 
data languages. This alternative approach has been 
designed and implemented at the Laboratory for Database 
Systems Research, Naval Postgraduate School, Monterey, 
California. Figure 25 depicts the multi-lingual database 
system. 

MLDS restricts users to access individual databases with 
their respective data languages. For example, a network 
database can only be accessed via the network-data-model- 
based Codasyl-DML language. The extension of MLDS will 
Support cross-model accessing of all the databases. The 
scope of this research is the design and implementation of 
an interface to support the access of a network database via 


SQL transactions. 
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| 
Relational Hierarchical T 


KDS (Atrribute-Based Model!) 


Multi-Backend Database System 





Figure 25. The Multi-Lingual Database System Concept 


A. A REVIEW OF THE RESEARCH 

We have presented three strategies for implementing this 
interface, inc luding high-level preprocessing, mixed- 
processing, and post-processing, prior to selecting the 
mixed-processing strategy as the most viable strategy. We 
have related our research with those research on cross-model 


accessing. 


B. WHAT WE HAVE ACCOMPLISHED 
The mixed-processing strategy involves two components. 
First, the schema transformation consisted of a methodology 


to map the network schema to a relational schema. This was 
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accomplished by cascading key fields from the network schema 
to the relational schema; thus maintaining the owner-member 
relationships by keys. Second, we described the data 
structures and implementation details necessary to 

integrate the schema transformer in the Language Interface 
Layer (LIL). 

The new language interface provides the capability of 
manipulating a network database via SQL transactions. This 
is accomplished by the translation of SQL transactions to an 
equivalent AB(network) transaction. We then detailed the 
changes to the existing relational-to-hierarchical language 
interface in order to provide us with cross-model accessing 
capability (i.e., relational-to-network). We then conclude 
our work by describing the four basic relational 
transactions, Select, Insert, Delete, and Update, in terms 
of the amount work entailed in the language interface and on 
the network database. 

Our efforts at the Laboratory for Database Systems 
Research illustrates that a multi-model database system 
(MMDS) could be designed and implemented using existing 
software and the potential for further extension of MLDS is 


only limited by the motivation for research in this area. 
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APPENDIX A 


THE RELATIONAL DATABASE STRUCTURE 





MODIFIED SYSTEMS CATALOG IN THE RELATIONAL/SOL LANGUAGE INTERFACE 


(incorporated to support views) 
o ee 
u h tps. h sal. 
si_eurr_db.cdi_de 3 OCT 1988 
ma 


rn. type 


ga Sones rdninum view A character, either 'T' 
ran .nams r z Ky il 
(Database The number of views in S K 0.651 T) 
name: the database. PLE or VISS 





rdn.num.rsl 








rn_num_view_def 


The number of ABDL re- 






rn next 


PR. EUr F 
rel quests making up the view 


attr 





rn first 
ran.first rel 





definition. 





rdn_curr_rel 
a view def 
T A pointer to the ABDL 
request describing the 


ran_ imp, table ran_imp_name 


view. 


The table in the ABDL The attribute in the ABDL 
request to which ran, request, located in view 
def, E to the 


attrihute in ran name. 





rdn_db,t;p» 


imp_name belongs. 





64 


APPENDIX B 
THE NETWORK DATABASE STRUCTURE 
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